IE evaluation: Criticisms and recommendations

Authors

  • A. Lavelli
  • M. E. Califf
  • F. Ciravegna
  • C. Giuliano
  • N. Kushmerick
  • L. Romano
Abstract

We survey the evaluation methodology adopted in Information Extraction (IE), as defined in the MUC conferences and in later independent efforts applying machine learning to IE. We point out a number of problematic issues that may hamper the comparison of results obtained by different researchers. Some of them are common to other NLP tasks: e.g., the difficulty of exactly identifying the effects on performance of the data (sample selection and sample size), of the domain theory (features selected), and of algorithm parameter settings. Issues specific to IE evaluation include how leniently to assess inexact identification of filler boundaries, the possibility of multiple fillers for a slot, and how the counting is performed. We argue that, when specifying an information extraction task, a number of characteristics should be clearly defined. In published papers, however, usually only a few of them are specified explicitly. Our aim is to elaborate a clear and detailed experimental methodology and propose it to the IE community. The goal is to reach widespread agreement on such a proposal so that future IE evaluations will adopt the proposed methodology, making comparisons between algorithms fair and reliable. To achieve this goal, we will develop and make available to the community a set of tools and resources that incorporate a standardized IE methodology.
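
To make the boundary-leniency issue concrete, the following minimal Python sketch scores the same predictions under two matching criteria. It is an illustration only, not the scorer used in MUC or in the paper: the span representation (character offsets, end exclusive), the function names `exact_match`, `overlap_match`, and `score`, and the greedy one-to-one counting scheme are all assumptions introduced here.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: a filler is a (start, end) character span, end exclusive.
Span = Tuple[int, int]


def exact_match(pred: Span, gold: Span) -> bool:
    """Strict criterion: both boundaries must coincide."""
    return pred == gold


def overlap_match(pred: Span, gold: Span) -> bool:
    """Lenient criterion (an assumption): any character overlap counts as a match."""
    return pred[0] < gold[1] and gold[0] < pred[1]


def score(preds: List[Span], golds: List[Span],
          match: Callable[[Span, Span], bool]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), pairing each gold filler with at most one prediction."""
    used_gold = set()
    true_positives = 0
    for pred in preds:
        for i, gold in enumerate(golds):
            if i not in used_gold and match(pred, gold):
                used_gold.add(i)
                true_positives += 1
                break
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(golds) if golds else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data: the second prediction misses the first two characters of its gold filler.
gold = [(10, 25), (40, 52)]
pred = [(10, 25), (42, 52), (60, 70)]

exact = score(pred, gold, exact_match)
lenient = score(pred, gold, overlap_match)
print("exact  :", exact)
print("lenient:", lenient)
```

On this toy data the strict criterion credits one of three predictions (precision 1/3, recall 1/2), while the overlap criterion credits two (precision 2/3, recall 1). The same system output therefore yields an F1 of 0.4 or 0.8 depending solely on the matching rule, which is why the criterion needs to be stated when results are reported.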

Similar resources

Evaluation of machine learning-based information extraction algorithms: criticisms and recommendations

We survey the evaluation methodology adopted in information extraction (IE), as defined in a few different efforts applying machine learning (ML) to IE. We identify a number of critical issues that hamper comparison of the results obtained by different researchers. Some of these issues are common to other NLP-related tasks: e.g., the difficulty of exactly identifying the effects on performance ...

Assessing allegations of domestic violence in child custody evaluations.

There has been an increased focus on child custody evaluations involving domestic violence allegations with much criticism of evaluators' training, practices, and procedures. A national survey of 115 child custody evaluators (doctoral and master's level) was conducted to explore these criticisms. Findings revealed adequate training, multiple sources of data collection, and practices/procedures ...

Sensitivity analysis of the 1998 Large Coastal Shark Evaluation Workshop results to new data and model formulations following recommendations from peer reviews

As a result of litigation, the 1998 Large Coastal Shark Evaluation Workshop Report was sent out for external peer-review. Two organizations, the Center for Independent Experts (CIE) and the National Resources Consultants (NRC), were selected to implement the review process, which resulted in a total of seven reviews. The present document addresses some of the criticisms and recommendations cont...

Evaluation of a nanosuspension formulation prepared through microfluidic reactors for pulmonary delivery of budesonide using nebulizers

This study aimed to determine the aerosolization behavior of a nanodispersion of budesonide, prepared using microfluidic reactors. The size and morphology of budesonide nanoparticles were characterized by photon correlation spectroscopy (PCS) and transmission electron microscopy (TEM). Processing/formulation parameters for formation of the nanoparticles were studied to determine their effects o...

KDOQI US commentary on the 2012 KDIGO clinical practice guideline for the evaluation and management of CKD.

The National Kidney Foundation-Kidney Disease Outcomes Quality Initiative (NKF-KDOQI) guideline for evaluation, classification, and stratification of chronic kidney disease (CKD) was published in 2002. The KDOQI guideline was well accepted by the medical and public health communities, but concerns and criticisms arose as new evidence became available since the publication of the original guidel...

Publication date: 2004